 relative order




Augmented Relevance Datasets with Fine-Tuned Small LLMs

arXiv.org Artificial Intelligence

Building high-quality datasets and labeling query-document relevance are essential yet resource-intensive tasks, requiring detailed guidelines and substantial effort from human annotators. This paper explores the use of small, fine-tuned large language models (LLMs) to automate relevance assessment, with a focus on improving ranking models' performance by augmenting their training dataset. We fine-tuned small LLMs to produce better relevance assessments, thereby improving the quality of the datasets used for downstream ranking model training. Our experiments demonstrate that these fine-tuned small LLMs not only outperform certain closed-source models on our dataset but also lead to substantial improvements in ranking model performance. These results highlight the potential of leveraging small LLMs for efficient and scalable dataset augmentation, providing a practical solution for search engine optimization.


Space reduction techniques for the $3$-wise Kemeny problem

arXiv.org Artificial Intelligence

Kemeny's rule is one of the most studied and well-known voting schemes, with important applications in computational social choice and biology. Recently, Kemeny's rule was generalized via a set-wise approach by Gilbert et al. This paradigm has an advantage over Kemeny's rule: the $3$-wise Kendall-tau distance between two rankings takes into account not only pairwise comparisons but also the discordance between the winners of subsets of three alternatives. The $3$-wise Kemeny problem consists of computing the set of $3$-wise consensus rankings, namely rankings whose total $3$-wise Kendall-tau distance to a given voting profile is minimized, and it is NP-hard. Despite this hardness, we establish in this paper several generalizations of the Major Order Theorems, obtained by Milosz and Hamel for Kemeny's rule, to the $3$-wise Kemeny voting scheme. These generalizations achieve a substantial search space reduction by determining in polynomial time the relative orders of pairs of alternatives. Essentially, our theorems quantify precisely the nontrivial property that if the preference for one alternative over another in an election is strong enough, not only in the head-to-head competition but even when taking into account one or two more alternatives, then the relative order of these two alternatives in all $3$-wise consensus rankings must be as expected. As an application, we also obtain an improvement of the Major Order Theorems for Kemeny's rule. Moreover, we show that, with respect to the $3$-wise Kemeny scheme, the well-known $3/4$-majority rule of Betzler et al. for Kemeny's rule is valid in general only for elections with no more than $5$ alternatives. Several simulations and tests of our algorithms on real-world and uniform data are provided.
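To make the distance in this abstract concrete, here is a minimal Python sketch. It assumes the set-wise definition in which the distance counts the subsets of two or three alternatives whose top-ranked element (the "winner") differs between the two rankings. The function names `kwise_kendall_tau` and `brute_force_consensus` are illustrative and do not come from the paper. The exhaustive search is only a baseline for tiny elections; it is not the space reduction technique the paper proposes.

```python
from itertools import combinations, permutations


def kwise_kendall_tau(r1, r2, k=3):
    """Count subsets of size 2..k whose winner differs between r1 and r2.

    Assumption: both rankings are sequences over the same alternatives,
    listed from most preferred to least preferred.
    """
    pos1 = {a: i for i, a in enumerate(r1)}
    pos2 = {a: i for i, a in enumerate(r2)}
    dist = 0
    for size in range(2, k + 1):
        for subset in combinations(r1, size):
            winner1 = min(subset, key=pos1.__getitem__)
            winner2 = min(subset, key=pos2.__getitem__)
            if winner1 != winner2:
                dist += 1
    return dist


def brute_force_consensus(profile, k=3):
    """Return all rankings minimizing the total k-wise distance to the profile.

    Exhaustive over all permutations, so only usable for a handful of
    alternatives.
    """
    best, best_cost = [], None
    for cand in permutations(profile[0]):
        cost = sum(kwise_kendall_tau(cand, vote, k) for vote in profile)
        if best_cost is None or cost < best_cost:
            best, best_cost = [cand], cost
        elif cost == best_cost:
            best.append(cand)
    return best, best_cost


profile = [("a", "b", "c"), ("a", "b", "c"), ("b", "a", "c")]
consensus, cost = brute_force_consensus(profile)
```

With `k=2` the same function reduces to the classical Kendall-tau distance, which makes it easy to compare the pairwise and $3$-wise schemes on small profiles.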


OPP-Miner: Order-preserving sequential pattern mining

arXiv.org Artificial Intelligence

A time series is a collection of measurements in chronological order. Discovering patterns from time series is useful in many domains, such as stock analysis, disease detection, and weather forecasting. To discover patterns, existing methods often convert time series data into another form, such as a nominal/symbolic format, to reduce dimensionality, which inevitably distorts the data values. Moreover, existing methods mainly neglect the order relationships between time series values. To tackle these issues, inspired by order-preserving matching, this paper proposes an Order-Preserving sequential Pattern (OPP) mining method, which represents patterns based on the order relationships of the time series data. An inherent advantage of such a representation is that the trend of a time series can be represented by the relative order of its underlying values. To obtain frequent trends in time series, we propose the OPP-Miner algorithm to mine patterns with the same trend (sub-sequences with the same relative order). OPP-Miner employs filtration and verification strategies to calculate the support and uses a pattern fusion strategy to generate candidate patterns. To compress the result set, we also study finding the maximal OPPs. Experiments validate that OPP-Miner is not only efficient and scalable but can also discover similar sub-sequences in time series. In addition, case studies show that our algorithms have high utility in analyzing the COVID-19 epidemic by identifying critical trends and improving clustering performance.
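The idea of representing a trend by relative order can be illustrated with a short Python sketch. It assumes a pattern is written as the rank of each value within its window (1 = smallest) and that values in a window are distinct. The helper names `relative_order` and `support` are hypothetical. The naive sliding-window scan below stands in for, and is much simpler than, the filtration, verification, and pattern fusion strategies the abstract attributes to OPP-Miner.

```python
def relative_order(window):
    """Map a window of values to its rank pattern, e.g. [10, 14, 12] -> (1, 3, 2).

    Assumption: values are distinct; ties would be broken by position.
    """
    order = sorted(range(len(window)), key=lambda i: window[i])
    ranks = [0] * len(window)
    for rank, idx in enumerate(order, start=1):
        ranks[idx] = rank
    return tuple(ranks)


def support(series, pattern):
    """Count the windows of the series whose relative order equals the pattern."""
    m = len(pattern)
    target = tuple(pattern)
    return sum(
        1
        for start in range(len(series) - m + 1)
        if relative_order(series[start:start + m]) == target
    )


series = [10, 14, 12, 20, 25, 21, 30]
up_then_partial_drop = support(series, (1, 3, 2))
```

Here the sub-sequences `[10, 14, 12]` and `[20, 25, 21]` have different values but the same relative order, so both count toward the support of the pattern `(1, 3, 2)`.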


STaCK: Sentence Ordering with Temporal Commonsense Knowledge

arXiv.org Artificial Intelligence

Sentence order prediction is the task of finding the correct order of sentences in a randomly ordered document. Correctly ordering the sentences requires an understanding of coherence with respect to the chronological sequence of events described in the text. Document-level contextual understanding and commonsense knowledge centered around these events are often essential in uncovering this coherence and predicting the exact chronological order. In this paper, we introduce STaCK, a framework based on graph neural networks and temporal commonsense knowledge to model global information and predict the relative order of sentences. Our graph network accumulates temporal evidence using knowledge of 'past' and 'future' and formulates sentence ordering as a constrained edge classification problem. We report results on five different datasets, and empirically show that the proposed method is naturally suitable for order prediction. The implementation of this work is publicly available at: https://github.com/declare-lab/sentence-ordering.
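As a rough illustration of how pairwise relative-order predictions can yield a full ordering, the Python sketch below takes a matrix of "sentence i precedes sentence j" probabilities and sorts sentences by their aggregated scores. Everything here is an assumption made for illustration: the function name `order_from_pairwise`, the averaging step that makes the two directions of each edge consistent, and the score-based decoding. The abstract does not describe the decoding procedure STaCK actually uses, and the probabilities are made-up inputs, not model outputs.

```python
def order_from_pairwise(prob_before):
    """Recover a sentence order from pairwise 'i precedes j' probabilities.

    prob_before[i][j] is a (hypothetical) classifier score that sentence i
    comes before sentence j. The two directions of an edge should be
    complementary, so each pair is replaced by a consistent average.
    """
    n = len(prob_before)
    scores = [0.0] * n
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Combine both edge directions into one consistent estimate.
            p = 0.5 * (prob_before[i][j] + (1.0 - prob_before[j][i]))
            scores[i] += p
    # Sentences that tend to precede the others come first.
    return sorted(range(n), key=lambda i: -scores[i])


# Illustrative scores for three shuffled sentences (diagonal is unused).
prob_before = [
    [0.0, 0.7, 0.2],
    [0.4, 0.0, 0.1],
    [0.9, 0.8, 0.0],
]
predicted_order = order_from_pairwise(prob_before)
```

In this toy input, sentence 2 is predicted to precede both others and sentence 0 to precede sentence 1, so the decoded order places sentence 2 first, then 0, then 1.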